Associations between fully-automated, 3D-based functional analysis of the left atrium and classification schemes in atrial fibrillation

Background Atrial fibrillation (AF) has been linked to left atrial (LA) enlargement. Whereas most studies focused on 2D-based estimation of static LA volume (LAV), we used a fully-automatic convolutional neural network (CNN) for time-resolved (CINE) volumetry of the whole LA on cardiac MRI (cMRI). The aim was to investigate associations between functional parameters from fully-automated, 3D-based analysis of the LA and current classification schemes in AF.
Methods We retrospectively analyzed consecutive AF patients who underwent cMRI on 1.5T systems including a stack of oblique-axial CINE series covering the whole LA. The LA was automatically segmented by a validated CNN. In the resulting volume-time curves, maximum LAV, minimum LAV and LAV before atrial contraction were automatically identified. Active, passive and total LA emptying fractions (LAEF) were calculated and compared to clinical classifications (AF Burden score (AFBS), increased stroke risk (CHA2DS2-VASc ≥2), AF type (paroxysmal/persistent), EHRA score, and AF risk factors). Moreover, multivariable linear regression models (mLRM) were used to identify associations with AF risk factors.
Results Overall, 102 patients (age 61±9 years, 17% female) were analyzed. Active LAEF (LAEF_active) decreased significantly with an increase of AFBS (minimal: 44.0%, mild: 36.2%, moderate: 31.7%, severe: 20.8%, p<0.003), which was primarily caused by an increase of minimum LAV. Likewise, LAEF_active was lower in patients with increased stroke risk (30.7% vs. 38.9%, p = 0.002). AF type and EHRA score did not show significant differences between groups. In mLRM, a decrease of LAEF_active was associated with higher age (per year: -0.3%, p = 0.02), higher AFBS (per category: -4.2%, p<0.03) and heart failure (-12.1%, p<0.04).
Conclusions Fully-automatic morphometry of the whole LA derived from cMRI showed significant relationships of LAEF_active with increased stroke risk and severity of AFBS. Furthermore, higher age, higher AFBS and presence of heart failure were independent predictors of reduced LAEF_active, indicating its potential usefulness as an imaging biomarker.


Introduction
Atrial fibrillation (AF) is the most common arrhythmic heart disease affecting about 1% of the general population and more than 12 million people in the US are expected to have AF by 2030 [1,2]. This is of clinical importance due to the association of AF with an increased stroke risk, reduction of quality of life and cognitive decline [3][4][5]. Several risk factors are associated with the development of AF, including age, AF type, AF Burden and cardiovascular (CV) risk factors such as arterial hypertension (HT), diabetes mellitus, and heart failure (HF) [6]. Furthermore, secondary conditions like hyperthyroidism or lifestyle factors may precipitate AF [7,8].
Multiple studies have shown that remodeling of the left atrium (LA) with AF leads to its enlargement [9][10][11]. However, LA enlargement might not present in some specific AF etiologies [12,13]. In consequence, current guidelines recommend the assessment of LA size for AF patients, commonly based on the static dimension in parasternal long axis, LA volume (LAV), or indexed LAV (LAVi) [14]. Volumetric parameters are mainly calculated from 2-dimensional data using volumetric approximations such as the area-length method. Besides static LAV assessment, LA function is another important parameter for its characterization. It can be measured from time-resolved (CINE) imaging as strain or LA emptying fractions (LAEF) [15]. The latter was first established based on echocardiography. Recently, preserved LAEF was reported to predict the outcome after HF using cardiac MRI (cMRI) [16]. Moreover, LA function was identified as predictor of CV events and the outcome after myocardial infarction [17,18]. However, LAEF assessment can be time consuming and requires specific knowledge to preprocess the imaging data.
Artificial intelligence (AI) and specifically deep learning (DL) have been shown to support LA segmentation at a single time point of the cardiac cycle or on post-contrast cMRI with excellent results [19,20]. DL-based segmentation of the whole LA over the cardiac cycle based on CINE cMRI has also been validated recently for biplane and 3D-based assessment [21]. However, this technique has not yet been applied in a patient cohort to investigate associations with clinical parameters.
The aims of this study were twofold. First, to assess the application of a fully-automatic approach to quantify LA functional parameters based on cMRI in a cohort of AF patients. Second, to investigate its associations with established and novel clinical classification schemes of AF and CV risk factors.

Study population
The study was approved by the local ethics committee (Ethikkommission Nordwest-und Zentralschweiz (EKNZ)) and complied with the Declaration of Helsinki. Patients gave written informed consent. We considered 181 consecutive patients with CINE MRI from our prospective AF cohort (SWISS AF PVI, Clinical Trial registry) retrospectively. Exclusion criteria were any prior LA ablation and AF during the cMRI. In addition, we performed a comprehensive analysis on all patients, excluding only patients with prior LA ablation. From study records, we extracted CV risk factors (HF, HT, diabetes, renal failure) and left ventricular ejection fraction (LVEF, based on echocardiography).

Image acquisition
cMRI scans were performed on 1.5T MRI systems (Siemens Avanto or Espree, Siemens Healthineers, Germany). Retrospectively ECG-gated, balanced steady-state free precession CINE sequences in oblique-axial orientation (planning was based on a 4CH scout) were acquired covering the whole LA in up to 12 axial stacks during breath-hold (TE: 1.1-1.2ms, flip angle: 58-64°, in-plane resolution: 192x156mm, spatial resolution: 1.8-2.0 x 1.8-2.0mm, slice thickness: 6mm, no section gap) with 25 frames per cardiac cycle. No further long- or short-axis views were included in the study protocol.

Convolutional neural network
The network was described and validated in detail elsewhere [21]. It was built to segment the LA using the area-length method (2D) and on multiple oblique-axial CINE sequences in 4CH orientation (3D); the latter was used in this study. Briefly, manual segmentations were performed by M.P. and S.K. using the oblique-axial CINE stack covering the whole LA in 50 cases (Segment v2.2 R6435, Medviso, Sweden) [22]. These time-resolved segmentations were exported as binary masks, each with 25 images / time points, resulting in 1,250 volumes. These segmentations served as the training dataset for a deep convolutional neural network (CNN), based on a U-Net architecture [23]. An anisotropic, slightly altered version of the original 3D U-Net was implemented for the segmentation of the LA with three resolution layers formed by two pooling and upsampling layers [24]. After training and validation, the resulting network was used to predict the segmentation for all cases. An example case with segmentation at fiducial time points is shown in Fig 1.

LA functional assessment from whole cardiac cycle
LA volume was calculated from the segmented 3D dataset as the sum of all identified LA voxels. This was performed for every time point of the cardiac cycle to create the volume-time curve for the LA (Fig 1). LAV_max and LAV_min were automatically identified. An additional fiducial point was the volume before atrial contraction (LAV_preA), defined as the volume at the time point just before the atrial contraction assists in emptying the LA [25]. Furthermore, the minimal volume between LAV_max and LAV_preA (LAV_min2) was identified automatically, primarily in order to guide the custom-written computational code (in MATLAB, MathWorks, USA) to perform automatic detection of fiducial points. If all fiducial points were identified, the patient was included in the main analysis.
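The volume-time curve and the fiducial points described above can be sketched as follows. This is a minimal Python illustration, not the authors' MATLAB code: the function names, the default voxel spacing, and the simple "last local maximum before LAV_min" heuristic for LAV_preA are assumptions made for this example, and the volume values are synthetic.

```python
def volume_time_curve(voxel_counts, spacing_mm=(1.8, 1.8, 6.0)):
    """Convert per-frame LA voxel counts into volumes in mL.

    spacing_mm is an assumed voxel size (in-plane x, in-plane y, slice thickness).
    """
    voxel_ml = spacing_mm[0] * spacing_mm[1] * spacing_mm[2] / 1000.0  # mm^3 -> mL
    return [count * voxel_ml for count in voxel_counts]


def find_fiducials(volumes):
    """Return frame indices of LAV_max, LAV_min2, LAV_preA and LAV_min.

    Hypothetical heuristic: the curve is treated as periodic and rotated so
    that it starts at LAV_max; LAV_preA is the last local maximum before
    LAV_min. Returns None if no such maximum exists (e.g. no atrial kick).
    """
    n = len(volumes)
    i_max = max(range(n), key=lambda i: volumes[i])
    rotated = [volumes[(i_max + k) % n] for k in range(n)]
    k_min = min(range(n), key=lambda k: rotated[k])

    k_pre = None
    for k in range(k_min - 1, 0, -1):  # walk backwards from LAV_min
        if rotated[k] > rotated[k - 1] and rotated[k] >= rotated[k + 1]:
            k_pre = k
            break
    if k_pre is None:
        return None

    # LAV_min2: smallest volume between LAV_max and LAV_preA
    k_min2 = min(range(1, k_pre), key=lambda k: rotated[k])
    to_frame = lambda k: (i_max + k) % n  # map back to original frame index
    return {"max": i_max, "min2": to_frame(k_min2),
            "preA": to_frame(k_pre), "min": to_frame(k_min)}


curve = [50, 60, 75, 90, 100, 85, 72, 70, 74, 60, 48]  # synthetic volumes (mL)
print(find_fiducials(curve))
```

A curve without a detectable atrial contraction (for example a monotonic decline from LAV_max to LAV_min) yields `None` in this sketch, which mirrors the situation in which a patient could not enter the main analysis.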
Based on available fiducial points, the total (LAEF_total), active (LAEF_active) and passive LAEF (LAEF_passive) were automatically calculated (Fig 1) [25,26]. Patients without the fiducial point of LAV_preA due to AF during MRI acquisition were included in the comprehensive analysis. In addition to the absolute volumes, we calculated the respective indexed LAV using the Body Surface Area (BSA) according to the Mosteller formula, resulting in maximum (LAVi_max) and minimum indexed LA volume (LAVi_min), indexed volume before atrial contraction (LAVi_preA), and indexed minimum volume between LAV_max and LAV_preA (LAVi_min2).
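As an illustration of these calculations, the sketch below uses the commonly applied definitions of the emptying fractions (total and passive LAEF relative to LAV_max, active LAEF relative to LAV_preA) and the Mosteller formula, BSA = sqrt(height [cm] x weight [kg] / 3600). The exact equations used in the study are not reproduced in this text, so these definitions are an assumption, as are the function names and the example values.

```python
import math


def emptying_fractions(lav_max, lav_pre_a, lav_min):
    """Return total, passive and active LAEF in percent (assumed standard definitions)."""
    total = (lav_max - lav_min) / lav_max * 100.0
    passive = (lav_max - lav_pre_a) / lav_max * 100.0
    active = (lav_pre_a - lav_min) / lav_pre_a * 100.0
    return {"total": total, "passive": passive, "active": active}


def bsa_mosteller(height_cm, weight_kg):
    """Body surface area in m^2 according to the Mosteller formula."""
    return math.sqrt(height_cm * weight_kg / 3600.0)


def indexed_volume(volume_ml, height_cm, weight_kg):
    """LA volume indexed to body surface area (mL/m^2)."""
    return volume_ml / bsa_mosteller(height_cm, weight_kg)


# Hypothetical patient: volumes in mL, height 180 cm, weight 80 kg
laef = emptying_fractions(lav_max=100.0, lav_pre_a=74.0, lav_min=48.0)
lavi_max = indexed_volume(100.0, height_cm=180.0, weight_kg=80.0)
print(laef, lavi_max)
```

Note that with these definitions the active fraction is normalized to LAV_preA rather than LAV_max, so the passive and active fractions do not simply add up to the total fraction.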
An overview of the entire automatic workflow pipeline can be found in Fig 2.

Classification of atrial fibrillation
Standard classification of AF was performed based on the presentation, duration and spontaneous termination of the AF episodes, resulting in the classes of paroxysmal and persistent AF [14]. In the recently proposed "4S-AF" scheme, the stroke risk (based on the CHA2DS2-VASc score), symptom severity (EHRA symptom score), and severity of AF Burden were proposed for the structured characterization of AF [14]. Stroke risk was stratified into low risk (CHA2DS2-VASc ≤1) and increased risk (CHA2DS2-VASc ≥2). Classification of the EHRA score was performed as follows: class 1 (no symptoms), class 2 (mild symptoms), class 3 (severe symptoms), class 4 (disabling symptoms). The severity of AF was characterized based on the established classification of paroxysmal and persistent AF and, additionally, using an established, symptomatic burden-based classification (AF Burden score (AFBS)). AFBS is a structured clinical assessment to evaluate frequency and duration of AF episodes as well as number of electrical cardioversions [27]. The sum of the frequency [daily (5 points), two or more days

Statistical analysis
Baseline characteristics of patients are presented as the count and percentage for categorical variables. For comparison of the continuous variables, the Shapiro-Wilk test for normality was performed, followed by either a t-test (two groups) or one-way ANOVA (more than two groups) in case of normal distribution. If data were not normally distributed, we performed the Kruskal-Wallis test. If there were more than two groups, post hoc Bonferroni correction was performed. Continuous variables were reported as mean ± standard deviation (SD) for normal distribution or median ± interquartile range (IQR) for non-normal distribution. Discrete variables were compared using Fisher's exact test.
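The post hoc Bonferroni step mentioned above can be illustrated with a short sketch. The analyses in this study were run in SPSS; the Python function below only demonstrates the adjustment itself (each p-value multiplied by the number of comparisons and capped at 1), and the p-values are hypothetical.

```python
def bonferroni(p_values):
    """Return Bonferroni-adjusted p-values (p * number of comparisons, capped at 1)."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values from three pairwise post hoc comparisons
raw = [0.004, 0.02, 0.3]
adjusted = bonferroni(raw)
significant = [p < 0.05 for p in adjusted]
print(adjusted, significant)
```

With four groups (such as the four AFBS categories) there would be six pairwise comparisons, so each raw p-value would be multiplied by six in this scheme.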
We used a multivariable linear regression model (mLRM) in a step-wise forward approach, corrected for age, BMI and sex, to investigate the relationship of functional LA parameters with the following clinical parameters: AF type, AFBS, EHRA score, CHA2DS2-VASc, diagnosis of HT, diabetes, HF, renal failure, and LVEF. Parameters with p-value < 0.1 were considered for the next step in the forward approach. Results of the univariable linear regression models (uLRM) are included in the Supplemental Material. Statistical analyses were performed using SPSS (IBM, USA) and a p-value < 0.05 was considered statistically significant.

Study cohort
We finally analyzed 102 patients with automatically calculated LAEF_total, LAEF_active and LAEF_passive. The segmentation algorithm failed in three patients, and in nine patients not all fiducial points could be identified due to multiple extrasystoles during MRI acquisition (Fig 3). The baseline characteristics can be found in Table 1.

Association of functional parameters with stroke risk based on CHA2DS2-VASc
The three functional parameters LAEF_total, LAEF_active, and LAEF_passive were all significantly lower for increased stroke risk (CHA2DS2-VASc ≥2; p<0.001, p = 0.002 and p<0.001, respectively; Fig 4, Table 2). This was based upon significant increases in minimum LAV parameters (LAV_min, LAV_preA, LAVi_min, LAVi_preA). A detailed CHA2DS2-VASc comparison for all categories can be found in the S1 Table.

Association of functional parameters with symptom severity & atrial fibrillation type
No significant differences were observed between EHRA score or AF type (paroxysmal or persistent AF) and any of the LA parameters (S2 and S3 Tables).

Prediction of functional left atrial parameters
Multivariable linear regression models were computed for total, active and passive LAEF in relation to the classification of AF based (stroke risk (CHA2DS2-VASc score), symptom severity   (Table 4). The effects of CHA2DS2-VASc, HF and HT in the univariable models did not prevail in the multivariable model.

We note that the classical differentiation between paroxysmal and persistent AF (AF type) and EHRA score did not have an impact on any parameter for both uLRM or mLRM.

Comprehensive analysis
The baseline characteristics from the comprehensive analysis of the entire cohort (n = 151 patients), including the patients in AF during MR acquisition, can be found in the Supplemental Material. Due to the 49 patients with AF during the MRI acquisition without active LA contraction, we could only investigate LAV_max, LAV_min, LAEF_total as well as LAVi_max and LAVi_min in all patients. In this cohort, we observed more patients with persistent AF and also overall higher AFBS (see S10 Table). AFBS and stroke risk based on CHA2DS2-VASc were lower in patients with lower LAV_min (p<0.001 and p = 0.01, respectively), lower LAVi_min (p<0.001 and p<0.001, respectively) and higher total LAEF (p<0.001 and p<0.001, respectively); S11 and S12 Tables. In contrast, LAV_max was not significantly different for AFBS and CHA2DS2-VASc-based stroke risk. We also observed that patients with persistent AF had higher minimum and maximum volumes (both indexed and absolute) and a lower total LAEF; S13 Table. Lastly, EHRA score did not show significant differences; S14 Table.

Discussion
In this study, we investigated LA functional parameters from a fully-automatic, 3D-based volumetric assessment of the LA over the whole cardiac cycle in relation to a comprehensive clinical AF classification scheme. The main observations of our study were:
1. The application of the previously validated, CNN-based segmentation algorithm was feasible for 3D segmentation in all except three patients. Overall, automatic detection of fiducial points from the resulting volume-time curve was possible in 102/114 patients (92%), allowing automatic differentiation between total, active and passive LAEF.
2. We identified a strong association of the LA function with AFBS and stroke risk based on CHA2DS2-VASc. In detail, LAEF_total and LAEF_active both showed a significant decline with increasing AFBS. In addition, patients with a lower total, active and passive LAEF presented with an increased stroke risk according to the CHA2DS2-VASc. For all those associations, the increase of LAV_min rather than an increase of LAV_max (for LAEF_total) or LAV_preA (for LAEF_active) seemed to be the underlying mechanism. Results from the entire cohort including patients in AF during the MRI and therefore without active LA contraction did not differ substantially from these observations.
3. Multivariable regression analyses, corrected for age, BMI and sex, revealed relationships between total, active and passive LAEF with established AF risk factors. Especially higher age was an independent predictor of a reduction of all three LAEF parameters in our AF cohort. In addition, a higher AFBS was an independent predictor of reduced LAEF_active, HT was an independent predictor of reduced LAEF_total, and HF independently predicted reduced active and total LAEF.

Applicability of the approach
LA size measured as maximum diameter in parasternal long axis by echocardiography at ventricular end-systole is an established parameter and was identified as predictor for AF in the Framingham Study [28]. For LAV assessment, superiority of cMRI over echocardiography was shown in the past while other techniques such as 3D mapping system also allow LA volumetry [29][30][31]. The vast majority of cMRI studies, however, used biplanar-based calculation for volumetric analysis of the LA with a focus on maximum (indexed) volume [16,17,32,33]. Longer acquisition times for 3D coverage of the LA and the time-consuming manual LA segmentation might explain the limited number of cMRI studies analyzing LA function based on 3D datasets in the past [29,34,35]. Recently, there were reports of AI tools segmenting the LA on CINE series for biplane or short axis-based assessment but not for oblique-axial orientation [36][37][38].
To overcome this limitation, we recently validated a CNN-based algorithm for LA segmentation over the whole cardiac cycle. This algorithm has two elements, one for biplane-based LA segmentation and one for 3D-based LA segmentation on oblique-axial CINE series [21]; the latter was used in this study. Volume-time curves were generated from the segmentations and characteristic fiducial points were automatically identified, which was possible in 102 patients. The reasons for failed identification of fiducial points were rhythm irregularities (numerous extrasystoles) during image acquisition. Real-time CINE imaging could generally be an option to overcome arrhythmias; however, such series were not acquired in this study. Due to low image resolution, volumetric assessment and therefore LAEF assessment could be compromised by real-time imaging.
Overall, automatic calculation of LAEF_total, LAEF_active and LAEF_passive could be achieved in 102 patients with sinus-rhythm during cMRI from a standard clinical protocol. This showed applicability of our comprehensive approach and represents an example of a fully automated, DL-supported workflow supporting a complex, cMRI-based analysis. However, in patients with AF during MRI acquisition, assessment of LAEF_active is not possible. Instead, our comprehensive analysis suggested that LAEF_total might serve as an alternative biomarker in this cohort.

LA volumes and LAEF as measures for LA size and function and their clinical implications
Single time point-based analysis of LA volume is common in clinical practice since LA enlargement was linked to multiple CV diseases [39]. However, the potential superiority of LAEF as a functional parameter over a static LAV alone was proposed by Hoit, who linked it to the importance of minimum LAV [39]. LAEF combines distinct measures of LAV_max, LAV_min and/or LAV_preA in one parameter and, therefore, strengthens the advantage of CINE analysis over single time point assessment. LAEF was already associated with silent strokes, HF and cardiomyopathies [15,32,33]. In patients with diagnosis of AF, Sievers et al. reported a mean total LAEF of 49.8% (in sinus rhythm) based on 3D assessment [34]; these results match our findings well (48.6%). Wandelt et al. performed manual 3D segmentations based on axial CINE series in an AF patient cohort. The reported mean values for LAEF_total, LAEF_active and LAEF_passive of 47.9%, 35.6% and 19.2%, respectively, were similar to our results (48.6%, 35.2% and 21.7%, respectively) [40].
LA function and AF burden. AF Burden is an important parameter to assess and classify AF patients [14]. In addition to the categorization in paroxysmal and persistent AF, we included AFBS, which characterizes AF Burden in more detail by combining frequency and duration of AF episodes as well as the number of cardioversions. These characteristics proved to be able to predict AF recurrence after first and repeated PVI better than the classic AF characterization [27]. In line with this observation, the AFBS, but not the conventional classification in paroxysmal and persistent AF, showed a significant relationship to active and total LAEF in this study. Accordingly, mLRM results showed a significant reduction of LAEF_active by 4.2% with each increase of AFBS category. These results were driven by a larger difference of minimum volume rather than of LAV_max for LAEF_total or LAV_preA for LAEF_active. The correlation of LAEF_active reduction with AFBS increase might be explained by the fact that a "healthier" (less remodeled) LA was able to actively pump more blood from the LA into the left ventricle at the end of ventricular diastole, resulting in a lower minimum LA volume. Other studies observed associations of decreased LAEF_active with non-obstructive, hypertrophic cardiomyopathy prior to LA enlargement and with adverse effects and death in hypertensive patients [15,41]. Our data suggested that LAEF_total and LAEF_active might additionally play an important role in the assessment of AF patients, at least if they are in sinus rhythm during MRI.
LA function and stroke risk based on CHA2DS2-VASc. Patients with AF suffer a 5-fold increased risk of stroke caused by thrombus formations in the LA and LA appendage [11]. The CHA2DS2-VASc score as a measure of the stroke risk in AF patients showed another important association with LA function in our study: in patients with an increased risk for stroke (based on a CHA2DS2-VASc ≥ 2), total, active and passive LAEF were significantly lower compared to low-risk patients (CHA2DS2-VASc ≤ 1). This indicated that a reduced LA function was associated with an increased risk for stroke in AF patients. In fact, all 7 patients with a reported stroke in our cohort had a CHA2DS2-VASc ≥ 2. This is in accordance with the current literature, where a reduction of LAEF_total was linked to cerebrovascular events and was observed in patients suffering a stroke [33,42]. In accordance with our observation, Leung et al. stated that LA function could provide additional risk stratification for stroke in patients with a high CHA2DS2-VASc score of ≥2 [43].
LA function and other AF risk factors. In a separate analysis, we furthermore identified significant relationships between established AF risk factors and the three LAEF parameters (LAEF_total, LAEF_active, LAEF_passive). In detail, the mLRM (corrected for age, BMI and sex) revealed a statistically significant, negative correlation of age with all three LAEF parameters. This is in accordance with the known importance of age as a risk factor for AF [44]. Arterial hypertension and HF, which are other known, major AF risk factors, were also independent predictors of reduced total and active LAEF [39,41]. In contrast, LVEF was (besides age) the only parameter to independently predict a higher LAEF_passive. This is in line with the fact that LAEF_passive is mainly determined by LV functionality [39,45]. Of note, LAEF_passive cannot accurately assess the conduit function of the LA because blood can pass through the LA directly from the pulmonary veins without changing the current LA volume [39]. In summary, age, HT, HF and, to a certain extent, LVEF are relevant, independent predictors of the LA function.

Limitations
This was a single-center, retrospectively analyzed study from a prospective cohort; therefore, generalizability of our results might be limited. AFBS is partially a subjective score, so patients without or with milder symptoms could be underrepresented.
When we planned this study, we focused on LA volumes assessment in 3D. Therefore, we only have oblique-axial CINE series available in this cohort and could not compare our parameters to biplane assessment of LA function which was a tradeoff to allow for 3D image acquisition.
The applied, previously established CNN for LA segmentation was built in-house on imaging data from one vendor. While it is openly accessible, its transferability to imaging studies from other institutions cannot be guaranteed.
We investigated only patients whose volume-time curves had all fiducial points available. This restricted the generalizability of the results to patients in SR at the time of cMRI, which might have caused a bias regarding patient selection; for example, this could have limited the number of patients with persistent AF (28% of all patients). To address this limitation, we performed the comprehensive analysis of the entire cohort, including patients with AF during the acquisition. Furthermore, we investigated the risk for stroke based on the CHA2DS2-VASc and not based on the clinical event of a stroke. The CHA2DS2-VASc was also not homogeneously distributed in our rather young patient cohort.
We did not perform continuous rhythm monitoring before the MRI; therefore, a reduced LA function might as well be a result of LA stunning due to previous spontaneous termination of AF. Furthermore, short paroxysmal episodes of AF could have happened during MRI and could have led to a missing atrial contraction, resulting in exclusion of these patients. Finally, an adjustment for multiple testing was not performed due to the exploratory nature of the comparison.

Conclusions
Our study showed that the fully-automatic characterization of LA function from 3D-based CINE cMRI is feasible in a clinical cohort of patients with diagnosis of AF. It revealed significant associations of LA functional parameters, especially active LAEF, with increased stroke risk (based on the CHA2DS2-VASc score) and the severity of the AF Burden. This indicates potential usefulness of active LAEF as an imaging biomarker, though its effect on clinical endpoints such as recurrence of AF, hospitalization, stroke, or mortality requires evaluation in further studies.

Supporting information
S1 Table. Comprehensive analysis, AF Burden score. Absolute and indexed minimum LA volume and LAEF_total were significantly different between groups while LAV_max was not. (DOCX)
S12 Table. Comprehensive analysis, CHA2DS2-VASc based stroke risk. Absolute and indexed minimum LA volume and LAEF_total were significantly different between groups while LAV_max was not. (DOCX)
S13 Table. Comprehensive analysis, AF type. LA volumes (minimum and maximum) were higher and LAEF_total lower in patients with persistent AF. (DOCX)
S14 Table. Comprehensive analysis, EHRA score. We did not find significant differences between EHRA Score and LA measures. (DOCX)
S1 Data. Minimal data set. (XLSX)